
Inside Macintosh: Programming With the Text Encoding Conversion Manager /
Appendix C - Some Character Encodings and Their Common Internet Names


Character Encodings and Their Internet Names

Table C-1 lists character encodings for various languages, gives some of their common Internet names, and identifies the version of the Text Encoding Conversion Manager in which each encoding was first supported by the Text Encoding Converter and by the Unicode Converter. In the last two columns of the table, "N/A" means that the encoding is not supported.

Table C-1 Character encoding Internet names and availability in Mac OS
Character encoding | Common Internet names | Related information | Text Encoding Conversion Manager version that first offered support in Text Encoding Converter | Text Encoding Conversion Manager version that first offered support in Unicode Converter
Universal
Unicode 2.0 (16-bit) | UTF-16 | | 1.2 | 1.2
Unicode 2.0 UTF-8 | UTF-8 | | 1.2 | 1.2.1
Unicode 2.0 UTF-7 | UTF-7 | | 1.2 | N/A
Unicode 1.1 (16-bit) | UNICODE-1-1 | | 1.2 | 1.2
Unicode 1.1 UTF-8 | UNICODE-1-1-UTF-8 | | 1.2 | 1.2.1
Unicode 1.1 UTF-7 | UNICODE-1-1-UTF-7 | | 1.2 | N/A
     
Western European languages
ASCII | US-ASCII | | 1.2.1 | 1.2.1
ISO 8859-1 (Latin-1) | ISO-8859-1, latin1 | | 1.2.1 | 1.2.1
CP 1252 (Windows Latin-1) | windows-1252, cp1252 | ISO 8859-1, plus additions in C1 area | 1.2 | 1.2
CP 437 (DOS Latin-US) | cp437 | | 1.2 | 1.2
CP 850 (DOS Latin-1) | cp850 | | 1.4 | 1.4
Mac OS Roman | mac, macintosh, x-mac-roman | | 1.2 | 1.2
Mac OS Icelandic | x-mac-icelandic | Based on Mac OS Roman | 1.2 | 1.2
Mac OS Latin-1, Mac OS Mail | x-mac-latin1 (commonly sent as ISO-8859-1) | Mac OS Roman permuted to align with 8859-1 | 1.2 | 1.2
NextStep Latin | | | 1.2 | 1.2
CP 037 (EBCDIC-US) | cp037 | ISO 8859-1 repertoire, different layout | 1.2.1 | 1.2.1
Arabic
ISO 8859-6 (Latin/Arabic) | ISO-8859-6, arabic | | 1.2 | 1.2
CP 1256 (Windows Arabic) | windows-1256, cp1256 | Partly 8859-6, plus C1 additions | 1.2 | 1.2
CP 864 (DOS Arabic) | cp864 | Encodes Arabic presentation forms | 1.2 | 1.2
Mac OS Arabic | x-mac-arabic | | 1.2 | 1.2
Mac OS Farsi | x-mac-farsi | | 1.2 | 1.2
     
Central European languages
ISO 8859-2 (Latin-2) | ISO-8859-2, latin2 | | 1.2 | 1.2
CP 1250 (Windows Latin-2) | windows-1250, cp1250 | Partly 8859-2, plus C1 additions | 1.2 | 1.2
Mac OS Central European Roman | x-mac-centraleurroman | | 1.2 | 1.2
Mac OS Croatian | x-mac-croatian | Based on Mac OS Roman | 1.2 | 1.2
Mac OS Romanian | x-mac-romanian | Based on Mac OS Roman | 1.2 | 1.2
     
Chinese
GB 2312-80 | | | 1.2 | N/A
EUC-CN | GB2312, X-EUC-CN | ASCII + GB 2312-80 (8-bit) | 1.2 | 1.2
CP 936 (DOS and Windows Simplified) | | Similar to GBK | 1.4 | 1.4
Mac OS Chinese Simplified | | Based on EUC-CN | 1.2 | 1.2
ISO 2022-CN ("GB") | ISO-2022-CN | ASCII + GB 2312-80 (7-bit) (see RFC 1922) | 1.2 | N/A
HZ | HZ-GB-2312 | ASCII + GB 2312-80 (7-bit) (see RFC 1842) | 1.2 | N/A
GBK (extended GB) | | EUC-CN + Unihan repertoire (8-bit) | 1.2 | 1.2
CNS 11643 plane 1 | x-cns11643-1 | | N/A | N/A
CNS 11643 plane 2 | x-cns11643-2 | | N/A | N/A
EUC-TW | X-EUC-TW | ASCII + CNS 11643-1992 (8-bit) | 1.2 | 1.2
Big-5 | Big5 | (8-bit) | 1.2 | 1.2
CP 950 (DOS and Windows Traditional) | | Based on Big-5 | 1.4 | 1.4
     
Mac OS Chinese Traditional | | Based on Big-5 | 1.2 | 1.2
CCCII | | | N/A | N/A
EACC | | | N/A | N/A
     
Cyrillic
ISO 8859-5 (Latin/Cyrillic) | ISO-8859-5, cyrillic | | 1.2 | 1.2
KOI8-R | KOI8-R | See RFC 1489 | 1.2 | 1.2
CP 1251 (Windows Cyrillic) | windows-1251, cp1251 | Not based on ISO 8859-5 | 1.2 | 1.2
CP 866 (DOS Russian) | cp866 | | N/A | N/A
Mac OS Cyrillic | x-mac-cyrillic | | 1.2 | 1.2
Mac OS Ukrainian | x-mac-ukrainian | Mac OS Cyrillic with two replacements | 1.2 | 1.2
     
Greek
ISO 8859-7 | ISO-8859-7, greek | | 1.2 | 1.2
ISO 5428 | ISO_5428:1980 | | N/A | N/A
CP 1253 (Windows Greek) | windows-1253, cp1253 | Nearly 8859-7, plus C1 additions | 1.2 | 1.2
Mac OS Greek | x-mac-greek | | 1.2 | 1.2
Greek CCITT | greek-ccitt | | N/A | N/A
     
Hebrew
ISO 8859-8 (Latin/Hebrew) | ISO-8859-8, hebrew | | 1.2 | 1.2
CP 1255 (Windows Hebrew) | windows-1255, cp1255 | Mostly 8859-8, plus C1 additions | 1.2 | 1.2
Mac OS Hebrew (2 variants) | x-mac-hebrew | | 1.2 | 1.2
     
Indic
ISCII-91 | | Parallel encodings for all Indic scripts | N/A | N/A
Mac OS Gujarati | | | 1.2 | 1.2
Mac OS Devanagari | | | 1.2 | 1.2
Mac OS Gurmukhi | | | 1.2 | 1.2
     
Japanese
JIS X0208 | | | 1.2 | N/A
JIS X0212 | | | N/A | N/A
EUC-JP | EUC-JP, X-EUC-JP | JIS 201 + JIS 208 + JIS 212 (8-bit) | 1.2 | 1.4
ISO 2022-JP ("JIS") | ISO-2022-JP | JIS 201 + JIS 208 + JIS 212 (7-bit); RFC 1468 | 1.2 | N/A
Shift-JIS | Shift_JIS, x-sjis, x-shift-jis | JIS 201 + JIS 208 (8-bit) | 1.2 | 1.2
CP 932 (DOS + Windows) | | Based on Shift-JIS | 1.4 | 1.4
Mac OS Japanese | | Based on Shift-JIS | 1.2 | 1.2
     
Korean
KSC 5601-1987 | | | 1.2 | N/A
EUC-KR | EUC-KR | ASCII + KSC 5601-87 (8-bit); RFC 1557 | 1.2 | 1.2
CP 949 (DOS + Windows) | | Unified Hangul Code: EUC-KR + Johab | N/A | N/A
Mac OS Korean | | Based on EUC-KR | 1.2 | 1.2
ISO 2022-KR ("KSC") | ISO-2022-KR | ASCII + KSC 5601-87 (7-bit); RFC 1557 | 1.2 | N/A
KSC 5700 | | | N/A | N/A
     
Symbols encoding
Adobe Symbol | Adobe-Symbol-Encoding | | N/A | N/A
Mac OS Symbol | x-mac-symbol | Based on Adobe Symbol | 1.2 | 1.2
Mac OS dingbats | x-mac-dingbats | Based on Adobe Zapf Dingbats | 1.2 | 1.2
     
Thai
TIS 620-2533 | | | N/A | N/A
CP 874 (DOS + Windows) | cp874 | Based on TIS 620-2533 | 1.4 | 1.4
Mac OS Thai | x-mac-thai | Based on TIS 620-2533 | 1.2 | 1.2
     
Turkish
ISO 8859-9 (Latin-5) | ISO-8859-9, latin5 | | 1.2 | 1.2
ISO 8859-3 (Latin-3) | ISO-8859-3 | | N/A | N/A
CP 1254 (Windows Latin-5) | windows-1254, cp1254 | | 1.2 | 1.2
Mac OS Turkish | x-mac-turkish | Based on Mac OS Roman | 1.2 | 1.2
     
Vietnamese
VISCII | VISCII | RFC 1456 | N/A | N/A
TCVN-n | | | N/A | N/A
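
As an illustration of how the information in Table C-1 might be consulted programmatically, the following Python sketch maps a few of the Internet names listed above to their table rows. It is not part of the Text Encoding Conversion Manager API: the names EncodingInfo, _ROWS, and lookup are hypothetical, only a handful of rows are copied from the table, and the sketch assumes that Internet names are matched without regard to case.

```python
from typing import NamedTuple, Optional


class EncodingInfo(NamedTuple):
    """One row of Table C-1 (hypothetical structure, for illustration only)."""
    encoding: str
    converter_version: Optional[str]  # Text Encoding Converter; None means "N/A"
    unicode_version: Optional[str]    # Unicode Converter; None means "N/A"


# A few rows copied from Table C-1; this is not the whole table.
_ROWS = [
    ("Unicode 2.0 UTF-8", ["UTF-8"], "1.2", "1.2.1"),
    ("Unicode 2.0 UTF-7", ["UTF-7"], "1.2", None),
    ("ISO 8859-1 (Latin-1)", ["ISO-8859-1", "latin1"], "1.2.1", "1.2.1"),
    ("Mac OS Roman", ["mac", "macintosh", "x-mac-roman"], "1.2", "1.2"),
    ("Shift-JIS", ["Shift_JIS", "x-sjis", "x-shift-jis"], "1.2", "1.2"),
    ("CP 866 (DOS Russian)", ["cp866"], None, None),
]

# Index every Internet name (lowercased) to its row.
_BY_NAME = {
    name.lower(): EncodingInfo(encoding, converter, unicode_conv)
    for encoding, names, converter, unicode_conv in _ROWS
    for name in names
}


def lookup(internet_name: str) -> Optional[EncodingInfo]:
    """Return the table row for an Internet name, or None if it is not listed."""
    return _BY_NAME.get(internet_name.strip().lower())


if __name__ == "__main__":
    for name in ("LATIN1", "utf-7", "x-sjis", "cp866", "no-such-name"):
        print(name, "->", lookup(name))
```

Several Internet names can refer to the same encoding, which is why the sketch indexes by name rather than by encoding. A row whose version fields are both None corresponds to an encoding that the table lists but marks "N/A" for both converters.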



© Apple Computer, Inc.
13 NOV 1997